Summer Olympic Games Analysis from 1896 to 2020

Team One: Dan Zhang, Daniel Zhou, Lu Li

October 20, 2021

Introduction

The Olympic Games are one of the most important international sports festivals. They began in 1896 in Greece and have since been staged every fourth year (Benagh, J., 2019). The Olympic Games are more than just a sports competition, because they bring the world community together to promote peace, cultural exchange, and understanding through athletics. The Games have also inspired people through their motto, “Faster, Higher, Stronger” (Douglas, P.S., 2004).

The Olympic Games also benefit host countries by raising their global stature and creating a sense of national pride. According to a global poll, a majority of people in 18 of 21 countries stated that their nation’s performance at the Olympics was “important to their national pride” (Procon.org., 2020). People are proud of their home country when it hosts the Olympic Games and when it wins at the Games. Thus, in this project, we analyze which countries have hosted the Olympics and how many Games each has hosted. We also investigate the relationship between being the host country and the country’s total medal count. In addition, significant political events, such as the World Wars, have affected the Olympic Games. We therefore observe how countries’ medal totals changed over time in order to understand the effect of significant political events on the Olympic Games.

Guiding Questions

(1) Which countries have hosted the most Olympics? Which continents have hosted the most Olympics?

(2) Is there any relationship between being the host country and the number of medals one country won?

(3) Which are the top countries based on the medals they have won? And is the medal tally related to the economy of the country?

(4) Do political events affect the Olympic Games?

Through our guiding questions, we hope to see how the Olympics have interacted with different parts of the world and how global events have influenced the Olympics over the years.

Dataset

The dataset (Tanoeiro, 2021) that we will be using comes from Kaggle.com, a website of open data sources that was recommended by the course. We are free to use the dataset for this project as it has been released under CC0, which means that the contributor has dedicated the dataset to the public domain by waiving all of their rights to the work worldwide under copyright law (Creative Commons Corporation, 2021).

The dataset we are using is a structured dataset that contains medal statistics from all the Summer Olympics held from 1896 to 2020. It is stored in CSV format.

There are 8 columns total:

Analysis

Data Wrangling

Our data is already structured, so our data wrangling was not particularly intensive. Here is a short summary of the steps we took.

Q1 Which countries have hosted the most Olympics? Which continents have hosted the most Olympics?

We first draw a bar plot to see which countries have hosted the Games and how many Games each has hosted.

As we can see from the bar plot:

One can also argue that Germany has hosted 2 Olympics if Germany and (the now defunct) West Germany are counted as one nation state.
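The tally behind the bar plot can be sketched with the standard library alone. This is a minimal sketch, not our original notebook code: the host list is typed in by hand, and the country labels (for example "Great Britain" and "West Germany") are assumptions that may not match the spelling used in the dataset. The `merge_germany` flag is a hypothetical helper option that illustrates the Germany and West Germany point above.

```python
from collections import Counter

# (year, host country) for every Summer Games held from 1896 to 2020.
# Assumption: labels are hand-typed and may differ from the dataset's labels.
HOSTS = [
    (1896, "Greece"), (1900, "France"), (1904, "United States"),
    (1908, "Great Britain"), (1912, "Sweden"), (1920, "Belgium"),
    (1924, "France"), (1928, "Netherlands"), (1932, "United States"),
    (1936, "Germany"), (1948, "Great Britain"), (1952, "Finland"),
    (1956, "Australia"), (1960, "Italy"), (1964, "Japan"),
    (1968, "Mexico"), (1972, "West Germany"), (1976, "Canada"),
    (1980, "Soviet Union"), (1984, "United States"), (1988, "South Korea"),
    (1992, "Spain"), (1996, "United States"), (2000, "Australia"),
    (2004, "Greece"), (2008, "China"), (2012, "Great Britain"),
    (2016, "Brazil"), (2020, "Japan"),
]


def count_hosts(hosts, merge_germany=False):
    """Return a Counter mapping each host country to its number of Games."""
    counts = Counter()
    for _year, country in hosts:
        # Optionally treat West Germany and Germany as one nation state.
        if merge_germany and country == "West Germany":
            country = "Germany"
        counts[country] += 1
    return counts


host_counts = count_hosts(HOSTS)
merged_counts = count_hosts(HOSTS, merge_germany=True)

# Print the hosts from most to fewest Games hosted.
for country, games in host_counts.most_common():
    print(f"{country}: {games}")
```

Calling `count_hosts(HOSTS, merge_germany=True)` gives the alternative count in which Germany is credited with both the 1936 and the 1972 Games.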

We next plot out the host countries on a map to examine where these host countries are situated.

As we can see from the map, most of the host countries are situated in the northern hemisphere. The only 2 host countries from the southern hemisphere are Brazil and Australia. This makes sense, as the majority of Earth’s landmass is in the northern hemisphere.

So what about continents? We know that the United States has hosted the most Olympics, but does that mean North America as a continent is also leading the way in the number of Olympics hosted? We will plot the host countries again, but this time we will group them by continent.

As we can see from the sunburst plot:

Let's plot it out on a map to see if there are any continents that we missed.

From our map, we can see that Africa and Antarctica have not hosted any Olympic Games.

As we can see, the majority of the Games have been hosted in Europe. This is not surprising, as Europe is known to be an economic powerhouse. It is also convenient to coordinate the Games there, as the International Olympic Committee is situated in Europe (Switzerland) as well.

Lastly, we will plot an interactive timeline of host countries, so we can check the selection of host countries over time.

As we move the slider to earlier years, we can see that the Olympics were mostly hosted in Europe and the United States, which is as expected based on our previous analysis of host countries and host continents. It is not until recent years that the hosting duties started to spread out more across different continents.

Here are the years of the first Olympic Games hosted on each continent, in chronological order:

So in conclusion for this guiding question, in terms of countries, the United States has hosted the most Olympic games, but in terms of continents, Europe has hosted the most Olympic games.
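The continent-level view can be reproduced with a small grouping step. The sketch below is an illustration under stated assumptions rather than our plotting code: the continent assignment is hand-typed (for instance, the Soviet Union is counted under Europe because Moscow hosted the 1980 Games), and the name `HOSTS_BY_CONTINENT` is introduced here only for the example.

```python
# Continent -> host country -> years hosted (Summer Games, 1896 to 2020).
# Assumption: the continent grouping is hand-assigned for illustration.
HOSTS_BY_CONTINENT = {
    "Europe": {
        "Greece": [1896, 2004], "France": [1900, 1924],
        "Great Britain": [1908, 1948, 2012], "Sweden": [1912],
        "Belgium": [1920], "Netherlands": [1928], "Germany": [1936],
        "Finland": [1952], "Italy": [1960], "West Germany": [1972],
        "Soviet Union": [1980], "Spain": [1992],
    },
    "North America": {
        "United States": [1904, 1932, 1984, 1996],
        "Mexico": [1968], "Canada": [1976],
    },
    "Asia": {"Japan": [1964, 2020], "South Korea": [1988], "China": [2008]},
    "Oceania": {"Australia": [1956, 2000]},
    "South America": {"Brazil": [2016]},
}

# Number of Games hosted on each continent.
games_per_continent = {
    continent: sum(len(years) for years in countries.values())
    for continent, countries in HOSTS_BY_CONTINENT.items()
}

# Year of the first Games on each continent, in chronological order.
first_games = sorted(
    (min(min(years) for years in countries.values()), continent)
    for continent, countries in HOSTS_BY_CONTINENT.items()
)

for continent, games in sorted(games_per_continent.items(),
                               key=lambda item: item[1], reverse=True):
    print(f"{continent}: {games} Games")
for year, continent in first_games:
    print(f"{year}: first Games in {continent}")
```

Continents that never appear as keys (Africa and Antarctica) are exactly the ones missing from the map.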

Q2 Is there any relationship between being the host country and the number of medals one country won?

Hosting the Olympic Games is an incredible honor for a country. Hosting the Olympics is reported not only to boost a country's economy, culture, environment, tourism, and sense of national pride, but also to improve its performance in terms of medals (Asgari, B. and Khorshidi, R., 2013). To analyze this point further, we compared the host country of each Games from 1896 to 2020 with the best-performing non-host country in the same year. The table and bar chart below list the number of gold, silver, bronze, and total medals of the host country and the best non-host country in each year. From the bar chart, we can see that in 1896, 1900, 1904, 1908, 1932, 1936, 1980, 1984, 1996, and 2008, the host country clearly outperformed the best non-host country. In the other years, however, the best non-host country won more medals. It is therefore hard to draw a conclusion by simply comparing the host country with the best non-host country in a given year, because a country's performance can be affected by many factors, and being the host is just one of them. To reduce the influence of the other factors, we also used radar charts to compare each country's medal counts when it was the host with its average medal counts when it was not the host.

The radar charts present a country's medal counts when it is the host in orange, and its average medal counts when it is not the host in blue. There were 29 Olympic Games in total from 1896 to 2020, and in most cases a country won more medals when it was the host, with only three exceptions: Great Britain in 1948, West Germany in 1972, and Canada in 1976. In fact, the performance of West Germany as host in 1972 was similar to its performance when it was not the host. Therefore, being the host country tends to improve a country's medal tally relative to its non-host average.
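The host versus non-host comparison behind the radar charts boils down to two averages per country. The following is a minimal sketch of that calculation; the medal totals are made-up numbers for a hypothetical country and are not taken from the dataset, and the function name `host_vs_non_host` is introduced only for this example.

```python
from statistics import mean

# Hypothetical medal totals for one made-up country, keyed by Games year.
# These numbers are illustrative only and do NOT come from the dataset.
medals_by_year = {1988: 20, 1992: 22, 1996: 24, 2000: 58, 2004: 30}
hosted_years = {2000}


def host_vs_non_host(medals_by_year, hosted_years):
    """Return (mean medals as host, mean medals as non-host)."""
    as_host = [m for year, m in medals_by_year.items() if year in hosted_years]
    as_guest = [m for year, m in medals_by_year.items()
                if year not in hosted_years]
    if not as_host or not as_guest:
        raise ValueError("need at least one host and one non-host Games")
    return mean(as_host), mean(as_guest)


host_avg, non_host_avg = host_vs_non_host(medals_by_year, hosted_years)
print(f"as host: {host_avg:.1f}, as non-host: {non_host_avg:.1f}")
print("host advantage" if host_avg > non_host_avg else "no host advantage")
```

Running the same function once for gold, silver, bronze, and total medals would give the four axes of one radar chart.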

Q3 Which are the top countries based on the medals they have won? And is the medal tally related to the economy of the country?

A country's performance at the Olympics is related to the size of its economy. In general, the bigger a country's economy, the more medals it wins. Of course, there are a few exceptions, such as Kenya, Ethiopia, and Ukraine. But at the Tokyo Olympics, 80% of the top 10 countries in the medal tally were high-income countries (Sen, S., 2021). To further examine the relationship between Olympic performance and a country's economy, we selected the top 20 countries by total medals won from 1896 to 2020.

First, a pie chart shows each country's share of the accumulated medal tally from 1896 to 2020. The top 20 countries are listed individually, while all others are grouped together. The top country in the medal tally is the United States, with 15.7% of all medals. The United States is also widely known as the largest economy in the world. According to World Population Review, the top 10 countries by GDP in 2020 are: United States (GDP: 20.49 trillion), China (GDP: 13.4 trillion), Japan (GDP: 4.97 trillion), Germany (GDP: 4.97 trillion), United Kingdom (GDP: 2.83 trillion), France (GDP: 2.78 trillion), India (GDP: 2.72 trillion), Italy (GDP: 2.07 trillion), Brazil (GDP: 1.87 trillion), and Canada (GDP: 1.71 trillion) (World Population Review, 2020). From the pie chart, if we exclude the Soviet Union because it dissolved in 1991, the top 10 countries by medal tally are the United States, Great Britain, France, Germany, China, Italy, Australia, Hungary, Sweden, and Japan. Thus, 7 of those 10 countries are also in the top 10 by GDP: the United States, Great Britain, France, Germany, China, Italy, and Japan.
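The overlap between the two top-10 lists can be checked mechanically. The sketch below uses only the country names quoted in the paragraph above; the one assumption is the alias that treats "Great Britain" (medal table) and "United Kingdom" (GDP list) as the same country, and the helper name `in_both` is introduced just for this example.

```python
# Top-10 lists exactly as quoted in the paragraph above.
TOP10_BY_MEDALS = [
    "United States", "Great Britain", "France", "Germany", "China",
    "Italy", "Australia", "Hungary", "Sweden", "Japan",
]
TOP10_BY_GDP = [
    "United States", "China", "Japan", "Germany", "United Kingdom",
    "France", "India", "Italy", "Brazil", "Canada",
]

# Assumption: "Great Britain" and "United Kingdom" denote the same country
# for the purposes of this comparison.
ALIASES = {"Great Britain": "United Kingdom"}


def in_both(medal_names, gdp_names, aliases):
    """Return the medal-table names that also appear in the GDP list."""
    gdp_set = set(gdp_names)
    return [name for name in medal_names
            if aliases.get(name, name) in gdp_set]


overlap = in_both(TOP10_BY_MEDALS, TOP10_BY_GDP, ALIASES)
only_medals = [name for name in TOP10_BY_MEDALS if name not in overlap]
print(f"{len(overlap)} of {len(TOP10_BY_MEDALS)} in both lists: {overlap}")
print(f"top 10 by medals only: {only_medals}")
```

Without the alias the count would drop by one, which is why name harmonization matters when joining the medal table with external economic data.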

Bar charts present the accumulated gold, silver, and bronze medals of the top 20 countries from 1896 to 2020 separately. The United States won 1060 gold medals from 1896 to 2020, more than twice as many as the second-ranked country (Soviet Union, 395 gold medals). The same trend can also be observed for silver and bronze medals, so the United States is the top country for all three medal types. We therefore conclude that a country's medal tally is related to the size of its economy. Here, we consider only GDP as an indicator of economy size; future research could examine additional economic indicators.

Q4 Do political events affect the Olympic Games?

Rule 50 of the Olympic Charter reads: “No kind of demonstration or political, religious or racial propaganda is permitted in any Olympic sites, venues or other areas.” (IOC Athletes’ Commission, 2021)

But politics has disrupted the Olympic Games throughout their history, whether through boycotts, propaganda, or protests. Here we will analyze the effect of some of these political events.

The scheduled 1916, 1940, and 1944 Olympic Games were all cancelled (Cancelled Olympic Games - Wikipedia, 2021) due to World War I and World War II.

When it comes to World War I and World War II, one country stands out right away: Germany. We therefore plot and analyze Germany's medal counts.

Many historians believe that, before World War II, Hitler saw the 1936 Games as an opportunity to promote his government and ideals of racial supremacy (Large, 2007). This is consistent with the fact that Germany won the most medals overall at those Games.

We see from the plot that Germany has competed in all the Games except 1920, 1924, and 1948. Germany was not permitted to take part in those Games as a consequence of the wars.

The United Team of Germany was a combined team of athletes from West Germany and East Germany that competed in the 1956, 1960, and 1964 Olympic Games.

From 1968 until the end of the Cold War, the two nation states sent independent teams designated as West Germany and East Germany. This lasted until the reunification of Germany in 1990, and a single German team has competed since the 1992 Games (Germany at the Olympics - Wikipedia, 2021).

Next, we plot and analyze the Soviet Union's data.

In 1980, the Soviet Union won the most gold and overall medals. Together with East Germany (which was administered and occupied by Soviet forces following the end of World War II), it won 127 out of 203 available gold medals.

Led by the United States, 66 countries boycotted the Games entirely. For 65 of the 66 countries, the official reason was the Soviet–Afghan War (U.S. Department of State, 2010). The Soviet Union and its allies later boycotted the 1984 Summer Olympics in Los Angeles, United States.

The 1988 Games were the last Olympics in which the Soviet Union officially participated, as the dissolution of the Soviet Union happened in 1991, before the 1992 Olympic Games.

The United States medal count is plotted and analyzed next. As noted earlier, the 1984 Games were boycotted by 14 Eastern Bloc countries and allies, led by the Soviet Union (Infoplease, 2006). This contributed directly to the United States winning by far the most medals at those Games.

However, that is not the record for most medals won at a single Olympic Games. The United States holds that record from an earlier edition, with 234 total medals at the 1904 Olympic Games. Tensions caused by the Russo–Japanese War and the difficulty of traveling to St. Louis in 1904 meant that very few top-class athletes from outside the US and Canada took part. Only 62 of the 651 athletes who competed came from outside North America, and 523 of the 651 were from the United States, so many events had only US athletes competing in them (Mallon, 2011).
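To put the 1904 participation figures in proportion, the athlete counts quoted above can be converted to percentage shares. This is simple arithmetic on the three numbers from the text (651, 523, and 62); the helper name `share` is introduced only for this example.

```python
# Athlete counts for the 1904 Games, as quoted in the paragraph above.
TOTAL_ATHLETES = 651
FROM_UNITED_STATES = 523
FROM_OUTSIDE_NORTH_AMERICA = 62


def share(part, whole):
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100 * part / whole


us_share = share(FROM_UNITED_STATES, TOTAL_ATHLETES)
outside_share = share(FROM_OUTSIDE_NORTH_AMERICA, TOTAL_ATHLETES)

print(f"United States: {us_share:.1f}% of athletes")
print(f"Outside North America: {outside_share:.1f}% of athletes")
```

Roughly four in five competitors were from the host country, which helps explain the unusually high US medal total that year.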

Let's wrap up with a follow-up to question 3, where we found the countries with the most medals. We now take the top countries by medal count and plot them over time to analyze any trends we see.

We can see from the above plot that the United States is consistently at the top of the medal table, with only the former Soviet Union able to consistently challenge it for the top spot during the Cold War period.

We can also see that the earlier Olympics were dominated by the host country, with France, the US, and the UK topping the medal table by a long shot in 1900, 1904, and 1908 respectively. This shows that the Olympics had yet to gain traction worldwide in their early stages.

Several significant global political events can be prominently seen on the medal table, including the Nazi Germany era of the 1930s, the Cold War period from the early 1950s to the late 1980s, the dissolution of the Soviet Union in the early 1990s, and the rise of China as an economic powerhouse in the late 1990s and early 2000s.

One interesting observation is that the United Kingdom has been slowly re-emerging up the medal table since the early 2000s, at a pace similar to China's. For the UK, however, no related political event coincides with this rise, and no obvious economic data supports it either: the UK's GDP growth of 163% from 2000 to 2020 (The World Bank Group, 2021a) is behind global GDP growth of 252% (The World Bank Group, 2021b). Through our research, we conclude that the rise is most likely due to the UK's own sports funding program, whose funding increased 466% from 2000 to 2020 (UK Sport, 2021). This shows that although global politics plays a large role in a country's results at the Olympics, government prioritization of and support for its sports programs is also very important to a country's success at the Olympic Games.

Lastly, we end our analysis with a fun little plot for Canada!

As you can see in the plot above, the medal count plots of Canada and the United States mirror each other almost perfectly. We also checked Mexico (due to proximity) and other countries with close political ties, such as the United Kingdom, and we were not able to find the same relationship in their medal count plots.

We really are America's little brother! (just kidding! but not really...)
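The visual "mirroring" of two medal count plots could also be quantified with a Pearson correlation coefficient, which we did not compute in this project. The sketch below shows one way to do so; the two series are made-up numbers chosen to be perfectly proportional and are not Canadian or US medal counts, and the function name `pearson` is introduced only for this example.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / (spread_x * spread_y)


# Hypothetical medal totals over five Games (illustrative only, NOT real data).
series_a = [10, 20, 15, 30, 25]
series_b = [1.0, 2.0, 1.5, 3.0, 2.5]  # same shape at one tenth the scale

r = pearson(series_a, series_b)
print(f"correlation: {r:.3f}")
```

A value close to 1 would support the claim that two countries' plots move together, while values near 0 (as we would expect for the countries where we saw no such pattern) would not.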

Conclusion

(1) The United States is the country that has hosted the most Olympics, and Europe is the continent that has hosted the most Olympics. Africa and Antarctica are the only continents that have never hosted the Olympics.

(2) Being the host country can improve the average medal tally.

(3) A country's medal tally is related to the size of its economy.

(4) The Olympics strive not to be political, but history shows that politics can affect the Olympic Games, both in terms of participating countries and in terms of medal counts. A government's prioritization of and support for its sports programs is also very important to a country's success at the Olympic Games.

References